Frivolous Units: Wider Networks Are Not Really That Wide

Authors

Abstract

A remarkable characteristic of overparameterized deep neural networks (DNNs) is that their accuracy does not degrade when the network width is increased. Recent evidence suggests that developing compressible representations allows the complexity of large networks to be adjusted for the learning task at hand. However, these representations are poorly understood. A promising strand of research inspired from biology involves studying representations at the unit level, as it offers a more granular interpretation of the neural mechanisms. In order to better understand what facilitates increases in width without decreases in accuracy, we ask: Are there mechanisms at the unit level by which networks control their effective complexity? If so, how do these mechanisms depend on the architecture, dataset, and hyperparameters? We identify two distinct types of “frivolous” units that proliferate when the network’s width is increased: prunable units, which can be dropped out of the network without significant change to the output, and redundant units, whose activities can be expressed as a linear combination of others. These units imply complexity constraints, as the function the network computes could be expressed without them. We also identify how the development of these units can be influenced by the architecture and a number of training factors. Together, these results help explain why the accuracy of DNNs does not degrade when width is increased, and they highlight the importance of frivolous units toward understanding implicit regularization in DNNs.


Similar resources

Are Social Networks Really Balanced?

There is a long-standing belief that in social networks with simultaneous friendly/hostile interactions (signed networks) there is a general tendency to a global balance. Balance represents a state of the network with lack of contentious situations. Here we introduce a method to quantify the degree of balance of any signed (social) network. It accounts for the contribution of all signed cycles ...

Are Nonsignificant Differences Really Not Significant?

Many researchers and reviewers hesitate to publish data when treatments are not “significantly” different. Most of us were taught to use a particular probability value (usually 5% or 1%) as a critical value to test hypotheses. The term “significant” has come to be synonymous with the phrase “significant at the 5% level.” Using relatively conservative critical values prevents readers from evalua...

MULTIPLICATION MODULES THAT ARE FINITELY GENERATED

Let $R$ be a commutative ring with identity and $M$ be a unitary $R$-module. An $R$-module $M$ is called a multiplication module if for every submodule $N$ of $M$ there exists an ideal $I$ of $R$ such that $N = IM$. It is shown that over a Noetherian domain $R$ with $\dim(R) \leq 1$, multiplication modules are precisely cyclic or isomorphic to an invertible ideal of $R$. Moreover, we give a charac...

Peer-Selected “Best Papers”—Are They Really That “Good”?

BACKGROUND Peer evaluation is the cornerstone of science evaluation. In this paper, we analyze whether or not a form of peer evaluation, the pre-publication selection of the best papers in Computer Science (CS) conferences, is better than random, when considering future citations received by the papers. METHODS Considering 12 conferences (for several years), we collected the citation counts f...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i8.16853